Distance-based Self-Attention Network for Natural Language Inference
Authors
Abstract
Attention mechanisms have been used as an ancillary means to help RNNs or CNNs. However, the Transformer (Vaswani et al., 2017) recently achieved state-of-the-art performance in machine translation, with a dramatic reduction in training time, by using attention alone. Motivated by the Transformer, the Directional Self-Attention Network (Shen et al., 2017), a fully attention-based sentence encoder, was proposed; it performed well on various data by using forward and backward directional information in a sentence. However, that study did not consider the distance between words, an important feature for learning the local dependency that helps a model understand the context of the input text. We propose the Distance-based Self-Attention Network, which accounts for word distance by using a simple distance mask, modeling local dependency without losing the inherent ability of attention to model global dependency. Our model performs well on NLI data and sets a new state-of-the-art result on SNLI. Additionally, we show that our model is particularly strong on long sentences or documents.
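To make the abstract's "simple distance mask" concrete, below is a minimal NumPy sketch of the general idea: an additive penalty on the attention logits that grows with the distance between positions. The exact functional form of the paper's mask and any learned parameters may differ; the `alpha` hyperparameter and the log-distance penalty here are illustrative assumptions, not the paper's definitive formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distance_masked_self_attention(x, alpha=1.0):
    """Self-attention with a distance-based penalty on the logits.

    x:     (seq_len, d) token representations.
    alpha: strength of the distance penalty (illustrative).
    """
    seq_len, d = x.shape
    # Scaled dot-product logits, as in the Transformer.
    logits = (x @ x.T) / np.sqrt(d)
    # Distance mask: the farther apart two positions are, the larger the
    # penalty, biasing attention toward local context while still
    # allowing global interactions.
    pos = np.arange(seq_len)
    dist = np.abs(pos[:, None] - pos[None, :])
    logits = logits - alpha * np.log1p(dist)
    return softmax(logits) @ x

# Toy usage: 5 tokens, 8-dimensional representations.
out = distance_masked_self_attention(np.random.randn(5, 8))
print(out.shape)  # (5, 8)
```

Because the penalty only shifts the logits rather than zeroing entries outright, every token can still attend to every other token, which is how the local bias is added without discarding global dependency modeling.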
Similar Resources
DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding
Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are widely used in NLP tasks to capture long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly shorter training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the atte...
Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling
Many natural language processing tasks rely solely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies via soft probabilities between every pair of tokens, but they are neither effective nor efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens b...
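To make the soft/hard contrast in this teaser concrete, here is a small NumPy sketch: soft attention spreads probability mass over every token, while a greatly simplified hard variant keeps only the top-k scoring tokens. The deterministic top-k selection is purely an illustrative stand-in for the learned, sampled selection (typically trained with reinforcement learning) that actual hard-attention models use.

```python
import numpy as np

def soft_attention(scores):
    # Soft attention: a dense probability distribution over every token.
    e = np.exp(scores - scores.max())
    return e / e.sum()

def hard_attention_topk(scores, k=2):
    # Simplified hard attention: select a subset of tokens outright.
    # (Real hard attention samples the subset and is trained with RL;
    # top-k is a deterministic stand-in for illustration.)
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-k:]] = 1.0
    return mask / mask.sum()

scores = np.array([0.1, 2.0, -1.0, 1.5, 0.3])
print(soft_attention(scores))       # dense probabilities over all 5 tokens
print(hard_attention_topk(scores))  # uniform mass on the 2 best tokens
```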
Character-level Intra Attention Network for Natural Language Inference
Natural language inference (NLI) is a central problem in language understanding. End-to-end artificial neural networks have recently reached state-of-the-art performance in the NLI field. In this paper, we propose the Character-level Intra Attention Network (CIAN) for the NLI task. In our model, we use a character-level convolutional network to replace the standard word embedding layer, and we use the...
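As a rough NumPy sketch of the character-level convolution idea this teaser mentions: each word's characters are embedded, convolved with a filter bank, and max-pooled over time to yield a word vector, in place of a lookup-table word embedding. All sizes, the single filter width, and the ASCII-based character indexing are illustrative assumptions, not CIAN's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
CHAR_DIM, N_FILTERS, WIDTH = 16, 32, 3
char_emb = rng.normal(size=(128, CHAR_DIM))           # per-character embeddings
filters = rng.normal(size=(N_FILTERS, WIDTH * CHAR_DIM))

def char_cnn_word_vector(word):
    """Word vector built from characters via conv + max-over-time pooling."""
    chars = char_emb[[ord(c) % 128 for c in word]]    # (len, CHAR_DIM)
    # Slide a width-3 window over the character sequence.
    windows = np.stack([
        chars[i:i + WIDTH].ravel()
        for i in range(len(chars) - WIDTH + 1)
    ])                                                # (len-2, WIDTH*CHAR_DIM)
    feats = windows @ filters.T                       # (len-2, N_FILTERS)
    return feats.max(axis=0)                          # max-over-time pooling

print(char_cnn_word_vector("inference").shape)  # (32,)
```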
Syntax-based Attention Model for Natural Language Inference
Introducing an attentional mechanism in neural networks is a powerful concept and has achieved impressive results in many natural language processing tasks. However, most existing models impose the attentional distribution on a flat topology, namely the entire input representation sequence. Clearly, any well-formed sentence has an accompanying syntactic tree structure, which is a much richer top...
Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference
The RepEval 2017 Shared Task aims to evaluate natural language understanding models for sentence representation, in which a sentence is represented as a fixed-length vector with neural networks and the quality of the representation is tested with a natural language inference task. This paper describes our system (alpha), which is ranked among the top in the Shared Task, on both the in-domain test ...
Journal: CoRR
Volume: abs/1712.02047
Pages: -
Publication date: 2017